Contents Modelling of Neo-Sumerian Ur III Economic Text Corpus
Author
Abstract
This paper describes a system for processing economic documents written in the ancient Sumerian language. The system is application-oriented and takes advantage of the simplicity of the ancient economy. We have developed an ontology for a selected branch of economic activity. We translate the documents into a meaning representation language by means of a semantic grammar. The meaning representation language is constructed so that it can handle the massive ambiguity caused by the specifics of the Sumerian writing system (polyvalence of signs, lack of mid-word signs), our incomplete knowledge of the Sumerian language, and frequent damage to the documents. The system can also process documents whose parts describe concepts not included in the ontology and grammar. As a result, we obtain a structural description of the documents' contents in the meaning representation language, ready for use in historical research.
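As a rough illustration of how a meaning representation might keep ambiguity explicit, the sketch below stores every candidate reading of a token and enumerates the resulting interpretations. It is not the paper's actual formalism: the `Token` class, the `ONTOLOGY` mapping, the `Unknown` concept, and all readings and concept names are toy placeholders assumed for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Token:
    """One written word with every reading still considered possible.

    An empty tuple stands for a damaged, illegible passage (assumption).
    """
    readings: tuple


# Hypothetical toy ontology mapping a reading to a concept name.
ONTOLOGY = {"udu": "Sheep", "gud": "Ox"}


def interpret(token):
    """Return the set of concepts a token may denote.

    Readings outside the ontology, and damaged tokens, map to "Unknown"
    so the rest of the document can still be processed.
    """
    concepts = {ONTOLOGY.get(r, "Unknown") for r in token.readings}
    return concepts or {"Unknown"}


def interpretations(tokens):
    """Enumerate every combination of concepts for a token sequence."""
    choices = [sorted(interpret(t)) for t in tokens]
    return [tuple(combo) for combo in product(*choices)]


# Toy document: an ambiguous token, a damaged token, an unambiguous token.
doc = [Token(("udu", "lu")), Token(()), Token(("gud",))]
for reading in interpretations(doc):
    print(reading)
```

Keeping the alternatives side by side, instead of committing to one reading early, lets later processing or a historian discard the implausible ones.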
Similar resources
Creating Tools for Morphological Analysis of Sumerian
Sumerian is a long-extinct language documented throughout the ancient Middle East, arguably the first language for which we have written evidence, and is a language isolate (i.e. no related languages have so far been identified). The Electronic Text Corpus of Sumerian Literature (ETCSL), based at the University of Oxford, aims to make accessible on the web over 350 literary works composed durin...
Unsupervised Sumerian Personal Name Recognition
This paper describes an unsupervised named-entity recognition (NER) system to identify personal names in Sumerian cuneiform documents from the Ur III period. We are motivated by the needs of social and economic historians of that period to identify specific persons of importance and such historically relevant facts as can be discerned by the surviving texts. The work was confronted by the chall...
Enhancing Sumerian Lemmatization by Unsupervised Named-Entity Recognition
Lemmatization for Sumerian is much more challenging than for modern languages: it is a long-dead language, highly skilled language experts are extremely scarce, and more and more Sumerian texts are being published. This paper describes how our unsupervised Sumerian named-entity recognition (NER) system helps to improve the lemmatization of the Cuneiform Digital Librar...
The Cosmological Hoe
The hoe—the sound of the word is sweet...the hoe makes everything prosper, the hoe makes everything flourish. The hoe is good barley...the hoe is brick moulds, the hoe has made people exist. It is the hoe that is the strength of young manhood. The hoe and the basket are the tools for building cities. It builds the right kind of house, it cultivates the right kind of fields. It is you, hoe, that...
Ontology-Based Knowledge Discovery from Documents in Natural Language
A vast amount of knowledge is contained in large collections of unstructured or weakly structured text documents, which started emerging soon after the discovery of writing. The utility of such document collections depends on the ability to find the relevant information. Users seek not only information localised in specific documents but also knowledge spread across the whole document collectio...